Abstract - Matrix Factorization techniques have been
successfully applied to raise the quality of suggestions generated
by Collaborative Filtering Systems (CFSs). Traditional CFSs
based on Matrix Factorization operate on the ratings provided
by users and have recently been extended to incorporate
demographic aspects such as age and gender. In this paper
we propose to merge CFSs based on Matrix Factorization with
information regarding social friendships in order to provide
users with more accurate suggestions and rankings on items of
their interest. The proposed approach has been evaluated on a
real-life online social network; the experimental results show
an improvement over existing CFSs. A detailed comparison
with related literature is also presented.

Keywords-Collaborative Filtering, Recommender Systems,
Social Networks, Matrix Factorization 

I. Introduction 

The term Collaborative Filtering (CF) refers to a wide 
range of algorithmic techniques targeted at learning users' 
preferences to recommend them items (like commercial 
products or movies) that are potentially relevant to their 
tastes [1]. CF techniques are an effective tool to support Web
users in finding contents of their interest and, at the same 
time, to limit the amount of (often useless) information they 
receive when looking for recommendations. 

CF techniques assume that users are allowed to rate 
available items over some scale and that the ratings are 
stored in a user-rating matrix R. Generally, CF techniques 
work by comparing the ratings provided by users. Users who 
in the past have given similar ratings to the same objects can 
be used to give reliable estimates of a user's experience. The 
rating a user u_x would assign to an item i_j is predicted by
considering the ratings of i_j provided by the users who are
most similar to u_x in terms of ratings.

In recent years, many researchers have sought to
improve the accuracy of CF techniques. In this scenario,
Non-Negative Matrix Factorization techniques (in short,
MF) have been widely and successfully applied to CF

us, a, na. 

MF techniques take the user-rating matrix R as input and 
factorize it into the product of two matrices P and Q in such
a way that each row of P (resp., Q) is associated with a user
(resp., an item). Such a mapping has a nice interpretation
that can be clarified through an example: consider



the movie domain and assume that each user is allowed 
to rate movies. MF techniques map the space of users 
(resp., movies) onto a low-dimensional space in which
any dimension can be conceptually interpreted as a movie 
genre (like "drama" or "comedy"). 

Once such a mapping has been performed, each user is 
identified by a vector whose components specify how much 
she is interested in a given movie genre. 

This mapping helps to raise the quality of the recommendations
produced. For instance, let us consider two users u_1
and u_2 and assume that both of them are strongly interested
only in adventure movies and dislike other genres; finally,
assume that none of the movies watched by u_1 has been
watched by u_2 and vice versa (even though this setting could
appear unrealistic, in real cases it is quite common).

In such a case, the comparison of the rating histories of u_1
and u_2 would not be effective: in fact, since the overlap of
the movies highly rated by u_1 and u_2 is empty, the movies
liked by u_1 are deemed as not relevant to u_2 and vice versa.

Such a conclusion is, of course, counterintuitive because
both u_1 and u_2 are interested in adventure movies and,
therefore, it is likely that some of the movies liked by u_1
could be of interest to u_2, and vice versa. By contrast, if we
used MF techniques, we would compare two users on
the basis of the genres they like, rather than on their past
rating histories and, therefore, we would not incur the
mistakes described above.

Original MF techniques consider only user ratings; they
have since been extended in a range of directions:
for instance, the approach of [3] considers different rating
styles (i.e., the fact that some users tend to be more generous
than others in assigning ratings) whereas other approaches
incorporate demographic information about users, like
gender or age [6]. Experimental trials show that the usage of
auxiliary information is useful to produce more and more 
accurate recommendations 0, |6). 

In our opinion, the approaches mentioned above could be 
extended in such a way as to consider the social nature of the 
Web and the thick fabric of social relationships among Web 
users. In fact, the emergence of collaborative platforms like 
Facebook or Flickr encouraged users to socialize in multiple 
ways: for instance, users join popular social networks like 



Facebook and spend most of their time interacting
with their friends, sharing contents like photos or videos, 
and posting comments/reviews. 

We believe that social relationships can provide an
effective tool to raise the accuracy of the recommendation
process. This corresponds to an intuitive idea: in real life,
people often ask their friends for advice before making a decision
and this advice plays a crucial role in their final decisions.

Asking for the help of other people to produce
recommendations is not new in Computer Science. In
fact, several authors introduced the concept of trust between
users and suggested using trust values in conjunction with
CF techniques to generate suggestions [7], [8], [9].

In this paper we propose to consider social relationships 
between users in conjunction with user ratings to generate 
recommendations. Our approach relies on MF techniques, 
but introduces some novel contributions. 

First of all, unlike traditional approaches, our
approach merges information about user ratings with
information on social relationships. In detail, our approach
to learning the latent space relies on the idea that if two
users are tied by a social relation like friendship, then they
should be mapped onto "close" vectors in the latent space.
This has a relevant consequence: opinions/ratings of friends
of a user u_x will be more influential than opinions/ratings
provided by users who are unknown to u_x.

As a further contribution, our approach suggests using
social relationships instead of trust relationships. This yields
three main novelties. First, trust relationships are
generally asymmetric, in the sense that if a user trusts
another one, the opposite may not hold true. By contrast,
social relationships can be both symmetric (e.g., friendship
relations) and asymmetric (e.g., a user u_x posts a
comment on the blog of a user u_y but u_y never replies
to u_x). Second, in trust-based systems users are required to
explicitly declare which users they trust: for instance, in a
system like Slashdot users are allowed to declare their friends
and foes. As for social relationships, we can use explicit
information provided by the users but we can also unobtrusively
monitor user behavior to learn a user's preferences and her
relationships with others. For instance, we can analyze the comments
posted by multiple users on a given subject (like the review
of a commercial product) to learn the personal opinion of a
particular user as well as to identify pairs of users showing
divergent (resp., convergent) opinions.

Finally, as for the third novelty, we observe that the
definition of trust relationships relies on the fact that a user
explicitly assumes that her interactions with another one are
beneficial to her. By contrast, in social relationships, there
are many reasons driving two users to interact and some of
them do not necessarily imply that a user gets some benefit
from another one: for instance, a user could get in touch
with another only to expand her knowledge on a topic.

The plan of the paper is as follows: in Section II we
review MF techniques for Collaborative Filtering. In Section
III we describe our approach. In Section IV we illustrate the
experiments we carried out to validate our technique. Related
literature is discussed in Section V. Finally, in Section VI
we draw our conclusions.

II. Matrix Factorization for Collaborative 
Filtering 

In recent years, many researchers have proposed applying
dimensionality reduction (DR) techniques to increase the
accuracy of Collaborative Filtering (CF) techniques.

In detail, a CF algorithm operates on a matrix R called
user-rating matrix. R has n_u × n_i entries, where
n_u (resp., n_i) is the number of users (resp., items); its generic
entry R_xj is the rating that user u_x assigned to item i_j. We
set R_xj = "*" if the user u_x has not rated i_j. The matrix R
is quite sparse because, in real-life domains, the number of
available items is usually very large in comparison with the
number of items typically evaluated by a user. Data sparsity
is one of the major drawbacks plaguing CF algorithms; in
fact, it negatively affects the computation of similarities
between users and items and this, ultimately, yields poor
results in predicting unknown ratings [10].

DR techniques have proven to be effective in fighting
against data sparsity [10]. The key idea of DR techniques
is to map the user-rating matrix onto a latent space of
dimension k, where k is a fixed integer. Due to this mapping,
an item i_j is represented by an item vector q_j ∈ R^k and a
user u_x is associated with a user vector p_x ∈ R^k.

A dimension in the latent space identifies a feature
describing an item or a user; for instance, if we consider the
movie domain, the dimensions may be interpreted as the
genres of a movie. For a given item i_j, the generic entry
q_j[ℓ], 1 ≤ ℓ ≤ k, of the item vector specifies whether i_j
possesses the ℓ-th factor and the strength of this possession.
An analogous interpretation holds for user vectors.

To better clarify these concepts, let us consider again the 
movie domain; a possible latent space associated with it 
would consist of movie genres like "comedy," "adventure,"
"romance" and "horror." An item i_j could be represented as
q_j = [0.5, 0.7, 0, 0], specifying that the genre of i_j is a blend
of "comedy" and "adventure."

In the latent space, the approval rating of a user u_x for an
item i_j is computed as the inner product p_x · q_j.
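
As a small illustration of the inner product above, the following sketch reuses the item vector from the movie example. The user vector `p_x` and the helper name `predict_rating` are hypothetical and introduced only for this example; they do not come from the paper.

```python
import numpy as np

# Latent dimensions, in the order used in the text.
GENRES = ["comedy", "adventure", "romance", "horror"]

# Item vector from the example above: a blend of comedy and adventure.
q_j = np.array([0.5, 0.7, 0.0, 0.0])

# Hypothetical user vector (an assumption): a user who mostly likes adventure.
p_x = np.array([0.2, 0.9, 0.0, 0.1])


def predict_rating(p: np.ndarray, q: np.ndarray) -> float:
    """Approval rating of a user for an item: inner product of their vectors."""
    return float(np.dot(p, q))


score = predict_rating(p_x, q_j)  # 0.2*0.5 + 0.9*0.7 = 0.73
```

Only the dimensions on which both vectors are non-zero contribute to the score, which is why two users with disjoint rating histories can still be compared through the genres they like.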

A possible approach to learning the latent space is given
by Singular Value Decomposition (SVD). The SVD
approximates R by means of its singular values and it has been
widely and successfully applied in the Information Retrieval
context [11], [12]. Unfortunately, in the context of CF, the
matrix R is not only sparse but many of its entries are
also unspecified, i.e., there are a lot of entries labeled with
the symbol "*" because a user may not be aware
of the existence of an item and, therefore, we cannot
conclude whether she likes or dislikes it. In such a
configuration, extensive experimental trials show that SVD
achieves poor performance [4].

Recently, many researchers suggested to apply matrix 
factorization techniques to CF systems (4|, 0, Q, (6). In 
this case, the computation of the latent space requires
solving a suitable optimization problem.

More formally, let P (resp., Q) be an n_u × k (resp., n_i × k)
matrix such that the x-th row p_x of P represents the vector
associated with the user u_x; in an analogous fashion, the
j-th row q_j of Q represents the vector associated with the
item i_j. The optimization problem to solve is

\[ \min C = \min \frac{1}{2} \|R - PQ^T\|_F^2 + \frac{\lambda}{2} \left( \|P\|_F^2 + \|Q\|_F^2 \right) \tag{1} \]

Here Q^T is the transpose of the matrix Q whereas the
symbol || · ||_F denotes the Frobenius norm of a matrix (see
footnote 2). Finally, the term (λ/2)(||P||_F^2 + ||Q||_F^2) is known
as Tikhonov regularization and it is used to avoid overfitting.
The parameter λ is usually computed by applying cross-validation.

A popular strategy to optimize the function C is based 
on the so called gradient descent method ||2) . To implement 
such a strategy, we must consider only the ratings actually 
provided by the users and ignore missing entries in R. To 
this purpose, if we set δ_xj = 1 if u_x has rated i_j and
0 otherwise, the optimization problem to solve is

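\[ \min \frac{1}{2} \sum_{x=1}^{n_u} \sum_{j=1}^{n_i} \delta_{xj} \left( R_{xj} - p_x \cdot q_j \right)^2 + \frac{\lambda}{2} \left( \|P\|_F^2 + \|Q\|_F^2 \right) \tag{2} \]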

The gradient descent procedure consists of four steps: 

1) The vectors p_x and q_j are initialized at random.

2) The partial derivatives of C are computed.

3) The vectors p_x and q_j are updated in the direction
opposite to the partial derivatives

\[ p_x = p_x - \rho \frac{\partial C}{\partial p_x}, \qquad q_j = q_j - \rho \frac{\partial C}{\partial q_j} \]

Here ρ is the step size (learning rate), to be determined.

4) Steps 2-3 are iterated until a particular number of 
iterations has been carried out or the improvement of 
the function C is less than a given threshold e. 
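
The four steps above can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function name `factorize`, the dense 0/1 `mask` array standing in for the indicators δ_xj, and the default values of `rho`, `eps` and `seed` are all assumptions made for the example.

```python
import numpy as np


def factorize(R, mask, k=2, lam=0.001, rho=0.01, max_iter=5000, eps=1e-9, seed=0):
    """Gradient descent on Equation (2).

    R    : (n_u, n_i) rating matrix; entries where mask == 0 are ignored.
    mask : (n_u, n_i) array of 0/1 values (the delta_xj indicators).
    """
    rng = np.random.default_rng(seed)
    n_u, n_i = R.shape
    # Step 1: random initialization of the user and item vectors.
    P = rng.random((n_u, k))
    Q = rng.random((n_i, k))
    R0 = np.where(mask > 0, R, 0.0)

    def cost(P, Q):
        err = mask * (R0 - P @ Q.T)
        return 0.5 * np.sum(err ** 2) + 0.5 * lam * (np.sum(P ** 2) + np.sum(Q ** 2))

    prev = cost(P, Q)
    for _ in range(max_iter):
        # Step 2: partial derivatives of C, restricted to observed ratings.
        err = mask * (P @ Q.T - R0)
        grad_P = err @ Q + lam * P
        grad_Q = err.T @ P + lam * Q
        # Step 3: move in the direction opposite to the partial derivatives.
        P = P - rho * grad_P
        Q = Q - rho * grad_Q
        # Step 4: stop when the improvement of C falls below eps.
        cur = cost(P, Q)
        if abs(prev - cur) < eps:
            break
        prev = cur
    return P, Q
```

Once `P` and `Q` are available, `P @ Q.T` gives a predicted rating for every user-item pair, including those that were missing in R.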

III. Approach Description 

In this section we describe our approach to merging social
relationships with user ratings. We assume that users can
create various types of social relationships (e.g., becoming
friends or joining the same groups). We say that a
social tie exists between two users if they created a social 

2 The Frobenius norm of a matrix A is defined as ||A||_F = sqrt(TR(A A^T)),
where TR denotes the trace of a matrix, i.e., the sum of the elements located
on its main diagonal.



relationship. A social tie can be symmetric or asymmetric. To
introduce our approach, we need the following definitions:
Definition 1: Let U = {u_1, u_2, ..., u_{n_u}} be a set of users
and I = {i_1, i_2, ..., i_{n_i}} be a set of items. A Social Rating
Network (SRN) is a 4-tuple SRN = (U, I, φ, ψ) where:

• φ : U × I → R^+ ∪ {"★"} is a function (called rating
function) which associates a user u_x ∈ U and an item
i_l ∈ I with a real number r_xl if u_x rated i_l with r_xl,
and with the symbol "★" if u_x has not rated i_l.

• ψ : U × U → {0, 1} is a function (called social
function) which takes a pair of users u_x, u_y ∈ U as
input and returns 1 if and only if a social tie exists
between them and 0 otherwise.

Due to the definition of social tie, the function ψ is
generally asymmetric, i.e., ψ(u_x, u_y) ≠ ψ(u_y, u_x) for some
pairs of users u_x and u_y. The function ψ is also useful to
define the concept of neighborhood of a user u_x in an SRN:

Definition 2: Let SRN = (U, I, φ, ψ) be a Social Rating
Network and u_x ∈ U be a user. The neighborhood N_x of
u_x is defined as the set of users having a social tie with u_x:

N_x = {u_y ∈ U : ψ(u_x, u_y) = 1}
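
To make Definition 2 concrete, the sketch below derives the neighborhoods from a set of directed ties. Representing the social function ψ as a set of ordered pairs, as well as the user names, are assumptions made for illustration only.

```python
from typing import Dict, Iterable, Set, Tuple


def neighborhoods(users: Iterable[str], ties: Set[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Return N_x for every user, where psi(u_x, u_y) = 1 iff (u_x, u_y) is in ties."""
    N: Dict[str, Set[str]] = {u: set() for u in users}
    for u_x, u_y in ties:
        N[u_x].add(u_y)
    return N


users = ["alice", "bob", "carol"]
# alice and bob have a symmetric tie; carol's tie to alice is asymmetric.
ties = {("alice", "bob"), ("bob", "alice"), ("carol", "alice")}
N = neighborhoods(users, ties)
```

Because the ties are stored as ordered pairs, the asymmetric case is handled naturally: alice belongs to carol's neighborhood, but carol does not belong to alice's.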

The concept of SRN specializes the concept of social
network; in fact, unlike traditional social networks, users
are not only allowed to interact with each other but also
to rate objects. To describe our approach, we start by observing
that Equation (1) provides an effective tool for tackling data
sparsity but it is agnostic about social relationships because
no terms related to social ties involving existing users appear
in it.

However, in reality, social relationships are a powerful
tool for producing suggestions: for instance, if a person is
uncertain about the purchase of a good, she often asks her
friends or acquaintances for opinions or advice, which often
play a crucial role in her final decision.

Our goal is, therefore, to extend Equation (1) to the case of
SRNs by adding terms capable of taking into account social
ties among users. To simplify the discussion, henceforth
we shall focus on friendship relations as models of social
relationships; however, without any loss of generality, the
conclusions we will draw are still valid for other types of
social relationships.

We start by observing that, if we solved the optimization
problem in Equation (1), we would map any user u_x onto
a point p_x in the latent space. We conjecture that, if two users
are friends, then they should be mapped onto close points
in the latent space; in other words, given three users u_x, u_y
and u_z such that only u_x and u_y are friends, the distance
between the points p_x and p_y must be less than the distance
between p_x and p_z.

Such a requirement has an easy explanation: in fact, if 
a user u x is close to her friends in the latent space, the 
opinions of the friends of u x will be more relevant to 



recommend items to u_x than the opinions of other users who
are unknown to u_x. From a mathematical standpoint, the
distance between two users u_x and u_y is given by ||p_x - p_y||,
where || · || is the Euclidean norm in the latent space. If we
denote as N_x the neighborhood of u_x, our aim is that the
term Σ_{u_y ∈ N_x} ||p_x - p_y|| is as small as possible.

These considerations suggest adding a penalty term to
Equation (1). In detail, among all the possible mappings to the
latent space, we decide to penalize those mappings in which
users tied by a friendship relation are mapped onto "far"
points. In the light of these considerations, the optimization
problem to solve is as follows

\[ \min_{(P,Q)} C' = \min_{(P,Q)} \frac{1}{2} \|R - PQ^T\|_F^2 + \frac{\lambda}{2} \left( \|P\|_F^2 + \|Q\|_F^2 \right) + \mu \sum_{x=1}^{n_u} \sum_{u_y \in N_x} \|p_x - p_y\| \tag{3} \]

As in the previous case, λ and μ are two constants to be
tuned to avoid overfitting. We applied the gradient descent
algorithm to solve our optimization problem. In such a case,
the partial derivatives of the objective function C' are

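\[ \frac{\partial C'}{\partial p_x} = \sum_{j=1}^{n_i} \delta_{xj} \left( p_x \cdot q_j - R_{xj} \right) q_j + \lambda p_x + \mu \sum_{u_y \in N_x} \left( p_x - p_y \right) \]

\[ \frac{\partial C'}{\partial q_j} = \sum_{x=1}^{n_u} \delta_{xj} \left( p_x \cdot q_j - R_{xj} \right) p_x + \lambda q_j \]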
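
A single gradient descent update based on the partial derivatives stated above could look like the sketch below. It implements those derivatives as written; the function names, the list-of-index-lists encoding of the neighborhoods N_x, and the default parameter values are assumptions of this example and are not taken from the paper.

```python
import numpy as np


def social_gradients(R, mask, P, Q, neighbors, lam, mu):
    """Partial derivatives of C' with respect to P and Q, as stated above.

    neighbors[x] lists the row indices (in P) of the users in N_x.
    """
    err = mask * (P @ Q.T - np.where(mask > 0, R, 0.0))
    grad_P = err @ Q + lam * P
    grad_Q = err.T @ P + lam * Q
    # Social term: pull each user vector towards the vectors of her neighbors.
    for x, N_x in enumerate(neighbors):
        for y in N_x:
            grad_P[x] += mu * (P[x] - P[y])
    return grad_P, grad_Q


def social_step(R, mask, P, Q, neighbors, lam=0.001, mu=0.001, rho=0.01):
    """One update of P and Q in the direction opposite to the gradients."""
    grad_P, grad_Q = social_gradients(R, mask, P, Q, neighbors, lam, mu)
    return P - rho * grad_P, Q - rho * grad_Q
```

With `mu = 0` the social term disappears and the update reduces to the plain matrix factorization step, which matches the behavior described for small values of the social weight.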

IV. Experiments 

In this section we present the experiments we carried
out to evaluate our approach. We built a social network
(called Cofe) in which users were allowed to rate movies.
The early users of Cofe were students enrolled in a BSc
degree in Computer Science at our University; after this,
students were allowed to invite their friends to join Cofe,
create friendship relationships with other members, insert
movie titles and rate them. We gathered data on 37 students
and 297 movie ratings. We used the Root Mean Square Error
(RMSE) as the metric to assess the quality of recommendations;
to define it, assume that the dataset is randomly split into
two parts called training and test set. The training set is used
to perform matrix factorization whereas the test set is used to
assess the quality of recommendations. If we define (i) r_xj as
the rating u_x assigned to i_j, (ii) r̂_xj as the rating predicted by a
given method for u_x and i_j and (iii) N_t as the size of the test
set, the RMSE is defined as

\[ \mathrm{RMSE} = \sqrt{\frac{1}{N_t} \sum \left( r_{xj} - \hat{r}_{xj} \right)^2} \]

where the sum runs over the ratings in the test set.
We compare our approach with the following methods:

• Naive. In this method, the rating of an item is predicted 
as the average of the ratings provided by all users. 

• NMF. In this method, we solve the optimization
problem in Equation (1) to predict missing ratings.
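
The RMSE metric and the Naive baseline can be sketched as follows. The data layout (lists of `(user, item, rating)` triples), the function names, and the fallback to the overall mean for items that never appear in the training set are assumptions of this sketch; the toy ratings are invented and are not the Cofe data.

```python
import math
from collections import defaultdict


def rmse(pairs):
    """pairs: iterable of (actual, predicted) ratings taken from the test set."""
    pairs = list(pairs)
    return math.sqrt(sum((r - r_hat) ** 2 for r, r_hat in pairs) / len(pairs))


def naive_predictor(train):
    """train: list of (user, item, rating). Predicts the mean rating of the item."""
    by_item = defaultdict(list)
    for _, item, rating in train:
        by_item[item].append(rating)
    overall = sum(r for _, _, r in train) / len(train)
    means = {item: sum(v) / len(v) for item, v in by_item.items()}
    # Fallback for items unseen in training (an assumption of this sketch).
    return lambda item: means.get(item, overall)


# Toy data, invented for illustration.
train = [("u1", "m1", 4), ("u2", "m1", 2), ("u1", "m2", 5)]
test = [("u3", "m1", 3), ("u2", "m2", 3)]
predict = naive_predictor(train)
error = rmse((r, predict(item)) for _, item, r in test)
```

On this toy split the baseline predicts 3 for `m1` and 5 for `m2`, so the squared errors are 0 and 4 and the RMSE equals the square root of 2.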
Table I
RMSE of Naive, NMF and OA for different values of k

Table II
Impact of μ on RMSE



To run both NMF and our approach (hereafter, OA) we
fixed, after a pre-tuning activity, λ = 0.001. In addition, we
fixed the number of iterations of both NMF and OA equal
to 5000.

In a first experiment we studied how the value of k (i.e.,
the number of dimensions in the latent space) impacts
the accuracy. We let k range over {2, ..., 16} and ran both NMF
and OA; of course, the Naive algorithm is not influenced by
k. The obtained results are reported in Table I.

From the analysis of this table we can conclude that: 

- OA significantly outperforms the Naive method and is 
better than NMF. 

- OA and NMF achieve their best performance when k is 
around 10. In fact, if the number k of dimensions exploited 
to represent user preferences and movie features is too low, it 
is not possible to accurately capture the differences between 
two movies or two users and, therefore, it is hard to correctly 
recommend movies to users. 

By contrast, if k exceeds 10, the RMSE of both OA and 
NMF does not significantly decrease. This means that a 
number of latent factors equal to 10 is enough to correctly 
capture movie genres and a further increase of k is useless. 

However, the higher k, the larger the size of P and Q and,
therefore, the larger the memory required to store
them. These considerations indicate that a good trade-off
between space requirements and recommendation accuracy
is achieved when k = 10.

As a further experiment, we tuned the value of μ, i.e., we
studied how the weight associated with social relationships
reflects on the quality of suggestions. The RMSE values obtained
when k = 10 and μ ranges from 10^-6 to 10^-1 are reported
in Table II. From Table II we observe that if μ → 0, the
term representing social relationships tends to vanish and,
therefore, OA degenerates into NMF (i.e., only user ratings
are considered). By contrast, if μ → 10^-1, an opposite effect
arises: information on social relationships dominates over
user ratings. The best value of RMSE is achieved if μ is
around 10^-3 because, in such a configuration, our approach
takes advantage of both user ratings and social relationships.

V. Related Works 

In this section we describe some approaches related to 
our research. In detail, we first consider Matrix Factorization 
(MF) techniques and describe how they have been applied 
in the context of Collaborative Filtering (CF); after this, we 
highlight the main novelties introduced by our approach. 
Then, we focus on Trust-Based Collaborative Filtering 
Systems, i.e., on CF systems in which users are allowed 
to explicitly declare if they trust (and sometimes distrust) 
other users. We explain the differences between trust-based 
CF systems and our research efforts. 

A. Matrix Factorization Techniques for CF 

The notion of Non-Negative Matrix Factorization (in
short, MF) was introduced for the first time in the seminal
paper of Lee and Seung in 1999 [2]. MF techniques have
been widely applied in multiple domains like data clustering
[13] and bioinformatics [14].

The algorithm proposed in [2] to perform matrix
factorization is iterative and it strongly resembles the gradient
descent method discussed in this paper.

One of the first approaches that exploited MF in the 
context of Collaborative Filtering is lfT31 . In that paper, the 
authors observe that the task of factorizing a matrix is
computationally challenging and, therefore, they propose a
strategy called Alternating Least Squares (ALS).

ALS-based techniques proceed iteratively and each
iteration consists of two stages: in the first stage the matrix P
is fixed and the problem described in Equation (1) is solved
with respect to the matrix Q. In the second stage, the roles are
swapped: Q is fixed and the problem is solved with respect to P.
In both stages, the optimization problem can
be reformulated (and solved) as a least squares problem [15],
[4]. ALS techniques lend themselves to massive
parallelization. Therefore, the growing success of distributed
computing platforms like Hadoop bodes well for the adoption
of these techniques.
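
The two-stage scheme can be sketched in NumPy as below. This is an illustrative reading of ALS restricted to observed ratings, not the algorithm of the cited papers: the function name `als`, the ridge-regression form of each sub-problem, and the default parameters are assumptions of this example.

```python
import numpy as np


def als(R, mask, k=2, lam=0.1, n_iter=20, seed=0):
    """Alternating Least Squares on observed entries only (mask == 1)."""
    rng = np.random.default_rng(seed)
    n_u, n_i = R.shape
    P = rng.random((n_u, k))
    Q = rng.random((n_i, k))
    I_k = np.eye(k)
    for _ in range(n_iter):
        # Stage 1: P is fixed; each item vector solves a small ridge regression.
        for j in range(n_i):
            rated = mask[:, j] > 0
            A = P[rated].T @ P[rated] + lam * I_k
            b = P[rated].T @ R[rated, j]
            Q[j] = np.linalg.solve(A, b)
        # Stage 2: Q is fixed; each user vector is solved in the same way.
        for x in range(n_u):
            rated = mask[x, :] > 0
            A = Q[rated].T @ Q[rated] + lam * I_k
            b = Q[rated].T @ R[x, rated]
            P[x] = np.linalg.solve(A, b)
    return P, Q
```

Within each stage the per-item (or per-user) systems are independent of one another, which is what makes the method easy to parallelize.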

As a further extension, the approach of [3] suggests
adding bias terms to Equation (1); these terms model the fact
that some users tend to provide more generous ratings than
others or that some items tend to receive ratings generally
higher than other items. In the same paper, the authors
study how temporal changes in user preferences and/or in
item properties (e.g., the fact that the popularity of an item
may decay over time) impact the process of generating
recommendations.
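
One common way to add such bias terms is to sum a global average, a user offset and an item offset with the latent inner product. The sketch below assumes this form; the function name and all numeric values are hypothetical and chosen only to illustrate the idea.

```python
import numpy as np


def predict_with_bias(global_mean, b_user, b_item, p_x, q_j):
    """Rating estimate: global mean + user bias + item bias + inner product."""
    return global_mean + b_user + b_item + float(np.dot(p_x, q_j))


# Hypothetical values: a generous user (+0.3) rating an item that tends
# to receive slightly lower ratings than average (-0.2).
estimate = predict_with_bias(
    global_mean=3.5,
    b_user=0.3,
    b_item=-0.2,
    p_x=np.array([1.0, 0.0]),
    q_j=np.array([0.4, 0.9]),
)
```

Here the inner product contributes 0.4, so the estimate is 3.5 + 0.3 - 0.2 + 0.4 = 4.0.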

Some approaches target at learning the latent space in 
presence of implicit user feedbacks 0. In this scenario, 



users do not directly provide ratings on items but the analysis 
of user behaviours is useful to understand user preferences. 
For instance, in an e-commerce Web site, we can assume that 
a user likes an item if she bought it. Finally, some authors
suggest incorporating demographic variables, like the gender
or the age of the users, into Equation (1).

Unlike the works above, in our approach the
task of learning the latent space relies on information tied
to both user ratings and their social relationships. In other
words, we merge information about social relationships and
user behaviours (encoded as the ratings they provided) in a
single framework to produce recommendations.

B. Trust-based CF systems 

The usage of trust information to generate recommendations
has been widely explored in the literature.

Some approaches suggest to use trust values between 
pairs of users in conjunction with Collaborative Filtering 
techniques to produce recommendations fl6l . ifTTl . Q. 

The key challenge in trust-based approaches is represented 
by the computation of trust values. In fact, in most cases, 
users provide few explicit declarations of trust; this implies 
that trust values among many pairs of users are unknown 
and, therefore, a mechanism for inferring new trust values 
from existing ones must be designed. 

Some approaches assume that existing users form a
community which is modeled through a graph G whose nodes
represent users and edges indicate relationships between
them [8], [18]. In [8] the authors apply a modified version
of the Breadth First Search algorithm on G to infer multiple
values of trust for each user; the average of these values is
computed to produce a final trust value.

The approach of [18] considers paths up to a fixed length
in G and propagates trust values explicitly declared by users
on them to infer new ones.

Recently, some approaches studied the problem of
propagating trust in signed social networks, i.e., social networks
in which ties between users may be either positive
(indicating, for instance, that two users are friends) or negative
(indicating a relationship like antagonism) [19], [20]. These
approaches are grounded in balance theory, introduced in
social psychology. Roughly speaking, balance theory is
based on principles like the enemy of my friend is my enemy
and the friend of my enemy is my enemy.

Unlike the approaches described above, our
approach is based on social relationships. This has two
major effects: (i) Social relationships are weaker than trust
relationships: in fact, in a social network two users may interact
for different reasons (e.g., to be informed on a new topic). By
contrast, if a user u_x trusts a user u_y, this means that u_x has
a reasonable expectation that interactions with u_y will be
beneficial for her. (ii) Trust relationships are asymmetric.
On the contrary, social relationships can be asymmetric as
well as symmetric.



In recent years, the possibility of exploiting
social relationships together with rating behaviors has
been advanced [21], [22]. In [21] the authors suggest that
drawing on similarity and familiarity between users in
their rating activities could support decision making and
increase recommendation quality. In [22] the authors propose a
model that combines social ties and ratings to improve
movie recommendation quality in the context of a real-world
social rating network, providing encouraging results. With
this paper we additionally substantiate the hypothesis that
combining Collaborative Filtering and social relationships
is helpful in order to build better Recommender Systems.

VI. Conclusions 

In this work we presented a novel strategy to provide
higher-quality recommendations to users in the context
of Social Rating Networks, i.e., social networks in which
users are allowed to socially interact and to rate items.
Our approach relies both on user ratings and on social
relationships among users and merges this information by
means of Matrix Factorization techniques. Experimental
results show the effectiveness of our approach.

We plan to extend our approach to Trusted Social Net- 
works, i.e., social networks in which users are allowed to 
define positive or negative trust in other users. Another issue 
to explore is related to the scalability of our approach. In de- 
tail, we plan to design and implement efficient (and possibly 
distributed) algorithms to perform Matrix Factorization. 